Functionality-Based Web Image Categorization

Authors

  • Jianying Hu
  • Amit Bagga
Abstract

The World Wide Web provides an increasingly powerful and popular publication mechanism. Web documents often contain a large number of images serving different purposes. Identifying the functional categories of these images has important applications, including information extraction, web mining, web page summarization and mobile access. This paper describes a study on the functional categorization of Web images using data collected from news web sites. We describe the image categories found in such web pages and their distributions, identify the main research issues involved in automatically classifying images into these categories, and present a novel algorithm for automatic identification of two of the most important image categories, namely story and preview images.

Similar resources

Image flip CAPTCHA

The massive and automated access to Web resources through robots has made it essential for Web service providers to determine whether the "user" is a human or a robot. A Human Interaction Proof (HIP) like Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA) offers a way to make such a distinction. CAPTCHA is a reverse Turing test used by Web serv...

Supervised Categorization of JavaScriptTM Using Program Analysis Features

Web pages often embed scripts for a variety of purposes, including advertising and dynamic interaction. Understanding embedded scripts and their purpose can often help to interpret or provide crucial information about the web page. We have developed a functionality-based categorization of JavaScript, the most widely used web page scripting language. We then view understanding embedded scripts a...

Exploiting Privileged Information from Web Data for Image Categorization

Relevant and irrelevant web images collected by tag-based image retrieval have been employed as loosely labeled training data for learning SVM classifiers for image categorization by only using the visual features. In this work, we propose a new image categorization method by incorporating the textual features extracted from the surrounding textual descriptions (tags, captions, categories, etc....

Refining Image Categorization by Exploiting Web Images and General Corpus

Studies show that refining real-world categories into semantic subcategories contributes to better image modeling and classification. Previous image sub-categorization work relying on labeled images and WordNet’s hierarchy is not only labor-intensive, but also restricted to classifying images into NOUN subcategories. To tackle these problems, in this work, we exploit general corpus information to a...

Non-photographic Image Categorization

The rapid growth of the IT industry today has undoubtedly boosted the widespread use of computer images in both web pages and modern computer programs. Applications like online image search, automatic webpage summarization and web mining rely heavily on image categorization. This project presents a system that categorizes non-photographic images based on their textual and image features. The cor...

Publication date: 2003